The Lextype DB: A Web-Based Framework for Supporting Collaborative Multilingual Grammar and Treebank Development

Authors

  • Chikara Hashimoto
  • Francis Bond
  • Dan Flickinger
Abstract

We have constructed a web-based framework for collaborative multilingual grammar and treebank development in which the developers are distributed around the world. In such a world-wide collaboration, it is important for developers to i) grasp and share the big picture of the grammar and treebank of each language and ii) understand the commonalities among languages. Our framework, the Lextype DB, describes the lexical types of a grammar and treebank. Lexical types can be seen as detailed parts of speech and are central to both of these goals. For each lexical type, the Lextype DB provides its linguistic characteristics, examples of usage from a treebank, the way it is implemented in a grammar, and its correspondences to major computational dictionaries. The Lextype DB consists of a database management system and a web-based interface, and is constructed semi-automatically. So far, we have applied it to grammars and treebanks of Japanese and English.
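
As a rough illustration of the kind of record the abstract describes, the sketch below stores one lexical type in an in-memory SQLite database and looks it up by name. The table name, column names, helper functions, and the sample entry are all hypothetical. The abstract does not specify the Lextype DB's actual schema or which database management system it uses, so SQLite is an assumption made only to keep the example self-contained.

```python
import sqlite3

# Hypothetical schema: one row per lexical type, with one column for each
# kind of information the abstract lists. This is NOT the real Lextype DB schema.
SCHEMA = """
CREATE TABLE lextype (
    name             TEXT PRIMARY KEY,  -- lexical type identifier
    language         TEXT NOT NULL,     -- language of the grammar, e.g. 'en'
    characteristics  TEXT,              -- linguistic characteristics
    treebank_example TEXT,              -- usage example taken from a treebank
    implementation   TEXT,              -- how the grammar implements the type
    dictionary_tags  TEXT               -- correspondences to other dictionaries
)
"""

# Invented placeholder entry, used only to exercise the code.
SAMPLE = {
    "name": "example-noun-lex",
    "language": "en",
    "characteristics": "placeholder description of the type",
    "treebank_example": "placeholder sentence from a treebank",
    "implementation": "placeholder type definition",
    "dictionary_tags": "placeholder part-of-speech tag",
}


def build_db():
    """Create an in-memory database holding the hypothetical lextype table."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows can then be converted to dicts
    conn.execute(SCHEMA)
    return conn


def add_lextype(conn, record):
    """Insert one lexical-type record given as a dict keyed by column name."""
    conn.execute(
        "INSERT INTO lextype VALUES (:name, :language, :characteristics, "
        ":treebank_example, :implementation, :dictionary_tags)",
        record,
    )


def lookup(conn, name):
    """Return the record for a lexical type as a dict, or None if absent."""
    row = conn.execute(
        "SELECT * FROM lextype WHERE name = ?", (name,)
    ).fetchone()
    return dict(row) if row is not None else None


if __name__ == "__main__":
    db = build_db()
    add_lextype(db, SAMPLE)
    print(lookup(db, "example-noun-lex"))
```

A web interface such as the one the abstract mentions would presumably sit on top of a lookup like this, rendering one page per lexical type; that layer is omitted here.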

Similar resources

Treebank-Based Multilingual Unification-Grammar Development

Broad-coverage, deep unification grammar development is time-consuming and costly. This problem can be exacerbated in multilingual grammar development scenarios. Recently, Cahill et al. (2002) presented a treebank-based methodology to semi-automatically create broad-coverage, deep, unification grammar resources for English. In this paper we present a project which adapts this model to a multilin...

An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies

A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...

A Multilingual SPARQL-Based Retrieval Interface for Cultural Heritage Objects

In this paper we present a multilingual SPARQL-based [1] retrieval interface for querying cultural heritage data in natural language (NL). The presented system offers an elegant grammar-based approach which is based on Grammatical Framework (GF) [2], a grammar formalism supporting multilingual applications. Using GF, we are able to present a cross-language SPARQL grammar covering 15 languages a...

Treebank-Based Acquisition of Multilingual Unification Grammar Resources

Deep unification- (constraint-)based grammars are usually hand-crafted. Scaling such grammars from fragments to unrestricted text is time-consuming and expensive. This problem can be exacerbated in multilingual broad-coverage grammar development scenarios. Cahill et al. (2002, 2004) and O’Donovan et al. (2004) present an automatic f-structure annotation-based methodology to acquire broad-coverage...

The Spanish DELPH-IN grammar

In this article we present a Spanish grammar implemented in the Linguistic Knowledge Builder system and grounded in the theoretical framework of Head-driven Phrase Structure Grammar. The grammar is being developed in an international multilingual context, the DELPH-IN Initiative, contributing to an open-source repository of software and linguistic resources for various Natural Language Processi...

Publication date: 2007